Goto

Collaborating Authors

 capturing long-range sequence dependency


Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations.

Neural Information Processing Systems

Learning representations that accurately capture long-range dependencies in sequential inputs --- including text, audio, and genomic data --- is a key problem in deep learning. Feed-forward convolutional models capture only feature interactions within finite receptive fields while recurrent architectures can be slow and difficult to train due to vanishing gradients. Here, we propose Temporal Feature-Wise Linear Modulation (TFiLM) --- a novel architectural component inspired by adaptive batch normalization and its extensions --- that uses a recurrent neural network to alter the activations of a convolutional model. This approach expands the receptive field of convolutional sequence models with minimal computational overhead. Empirically, we find that TFiLM significantly improves the learning speed and accuracy of feed-forward neural networks on a range of generative and discriminative learning tasks, including text classification and audio super-resolution.



Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations.

Neural Information Processing Systems

Learning representations that accurately capture long-range dependencies in sequential inputs --- including text, audio, and genomic data --- is a key problem in deep learning. Feed-forward convolutional models capture only feature interactions within finite receptive fields while recurrent architectures can be slow and difficult to train due to vanishing gradients. Here, we propose Temporal Feature-Wise Linear Modulation (TFiLM) --- a novel architectural component inspired by adaptive batch normalization and its extensions --- that uses a recurrent neural network to alter the activations of a convolutional model. This approach expands the receptive field of convolutional sequence models with minimal computational overhead. Empirically, we find that TFiLM significantly improves the learning speed and accuracy of feed-forward neural networks on a range of generative and discriminative learning tasks, including text classification and audio super-resolution.


Reviews: Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations.

Neural Information Processing Systems

Two or three relevant citations: Transformer models should probably be mentioned in the section on "models designed specifically for use on sequences", since they are competing heavily with the referenced baselines on NLP tasks especially. I believe your numbers on the Yelp dataset compare very favorably to the "sentiment neuron" work from Radford et al https://arxiv.org/abs/1704.01444 - that could be a nice addition and add further external context to your results. Some questions about the architecture, particularly the importance of the "additive skip connection" from input to output - how crucial is this connection, since it somewhat allows the network to bypass the TFiLM layers entirely? Does using a stacked skip (with free trainable parameters) still work, or does it hurt network training / break it completely? What is the SNR of the cubic interpolation used as input for the audio experiments?


Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations.

Neural Information Processing Systems

Learning representations that accurately capture long-range dependencies in sequential inputs --- including text, audio, and genomic data --- is a key problem in deep learning. Feed-forward convolutional models capture only feature interactions within finite receptive fields while recurrent architectures can be slow and difficult to train due to vanishing gradients. Here, we propose Temporal Feature-Wise Linear Modulation (TFiLM) --- a novel architectural component inspired by adaptive batch normalization and its extensions --- that uses a recurrent neural network to alter the activations of a convolutional model. This approach expands the receptive field of convolutional sequence models with minimal computational overhead. Empirically, we find that TFiLM significantly improves the learning speed and accuracy of feed-forward neural networks on a range of generative and discriminative learning tasks, including text classification and audio super-resolution.


Temporal FiLM: Capturing Long-Range Sequence Dependencies with Feature-Wise Modulations.

Birnbaum, Sawyer, Kuleshov, Volodymyr, Enam, Zayd, Koh, Pang Wei W., Ermon, Stefano

Neural Information Processing Systems

Learning representations that accurately capture long-range dependencies in sequential inputs --- including text, audio, and genomic data --- is a key problem in deep learning. Feed-forward convolutional models capture only feature interactions within finite receptive fields while recurrent architectures can be slow and difficult to train due to vanishing gradients. Here, we propose Temporal Feature-Wise Linear Modulation (TFiLM) --- a novel architectural component inspired by adaptive batch normalization and its extensions --- that uses a recurrent neural network to alter the activations of a convolutional model. This approach expands the receptive field of convolutional sequence models with minimal computational overhead. Empirically, we find that TFiLM significantly improves the learning speed and accuracy of feed-forward neural networks on a range of generative and discriminative learning tasks, including text classification and audio super-resolution.